Owing to its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks in how positive and negative samples are constructed limit further improvement of existing algorithms. 1) The quality of positive samples depends heavily on carefully designed data augmentations, and inappropriate augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph with specially designed Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct positive samples from the same high-confidence cluster in the two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls samples from the same cluster together and pushes those from other clusters away by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with existing state-of-the-art algorithms.
translated by Google Translate
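The CCGC objective described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the exact loss, so the form used here (mean of one minus the positive cosine similarity, plus the mean negative cosine similarity) and the names `cosine`, `centers`, and `cluster_guided_loss` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def centers(z, labels):
    """Mean embedding of each cluster, keyed by cluster label."""
    groups = {}
    for vec, lab in zip(z, labels):
        groups.setdefault(lab, []).append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in groups.items()}


def cluster_guided_loss(z1, z2, labels):
    """Hypothetical loss (assumed form, not taken from the paper).

    z1, z2 : embeddings of the high-confidence nodes in view 1 and view 2
    labels : pseudo-label (cluster id) of each high-confidence node
    """
    n = len(labels)
    # Positive pairs: nodes sharing a high-confidence cluster, one from each view.
    pos = [cosine(z1[i], z2[j])
           for i in range(n) for j in range(n) if labels[i] == labels[j]]
    # Negative pairs: centers of different clusters, one from each view.
    c1, c2 = centers(z1, labels), centers(z2, labels)
    neg = [cosine(c1[a], c2[b]) for a in c1 for b in c2 if a != b]
    pos_term = sum(1.0 - s for s in pos) / len(pos)  # small when positives agree
    neg_term = sum(neg) / len(neg) if neg else 0.0   # small when centers differ
    return pos_term + neg_term


# Toy check: separated clusters versus fully collapsed embeddings.
labels = [0, 0, 1, 1]
separated = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
collapsed = [[1.0, 1.0]] * 4
loss_separated = cluster_guided_loss(separated, separated, labels)
loss_collapsed = cluster_guided_loss(collapsed, collapsed, labels)
```

In this toy setting the separated embeddings incur almost no loss, while the collapsed ones are penalized through the center-to-center term, which is the behavior the abstract attributes to using cluster centers as negatives.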
Graph contrastive learning is an important method for deep graph clustering. Existing methods first generate graph views with stochastic augmentations and then train the network under a cross-view consistency principle. Although good performance has been achieved, we observe that existing augmentation methods are usually random and rely on pre-defined augmentations, which is insufficient and leaves the augmentation uncoordinated with the final clustering task. To solve this problem, we propose a novel Graph Contrastive Clustering method with Learnable graph Data Augmentation (GCC-LDA), in which the augmentation is optimized entirely by neural networks. An adversarial learning mechanism is designed to keep cross-view consistency in the latent space while ensuring the diversity of the augmented views. In our framework, a structure augmentor and an attribute augmentor are constructed to learn augmentations at both the structure level and the attribute level. To improve the reliability of the learned affinity matrix, clustering is introduced into the learning procedure, and the learned affinity matrix is refined with both the high-confidence pseudo-label matrix and the cross-view sample similarity matrix. During training, to provide persistent optimization for the learned view, we design a two-stage training strategy that obtains more reliable clustering information. Extensive experimental results demonstrate the effectiveness of GCC-LDA on six benchmark datasets.
Graph anomaly detection (GAD) is a vital task in graph-based machine learning and has been widely applied in many real-world applications. The primary goal of GAD is to identify anomalous nodes in graph datasets, that is, nodes that clearly deviate from the majority. Recent methods have applied contrastive strategies at various scales to GAD, i.e., node-subgraph and node-node contrasts. However, they neglect subgraph-subgraph comparison information, in which normal and abnormal subgraph pairs behave differently in terms of embeddings and structures, resulting in sub-optimal task performance. In this paper, we realize this idea in a multi-view multi-scale contrastive learning framework that introduces subgraph-subgraph contrast for the first time. Specifically, we regard the original input graph as the first view and generate the second view by graph augmentation with edge modifications. Guided by maximizing the similarity of the subgraph pairs, the proposed subgraph-subgraph contrast yields subgraph embeddings that are more robust to structural variation. Moreover, the subgraph-subgraph contrast cooperates well with the widely adopted node-subgraph and node-node contrasts, and the three improve GAD performance together. We also conduct experiments to investigate the impact of different graph augmentation approaches on detection performance. Comprehensive experimental results demonstrate the superiority of our method over state-of-the-art approaches and the effectiveness of the multi-view subgraph-pair contrastive strategy for the GAD task.
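Automated detection of retinal structures, such as retinal vessels (RV), the foveal avascular zone (FAZ), and retinal vascular junctions (RVJ), is of great importance for understanding diseases of the eye and for clinical decision-making. In this paper, we propose a novel Voting-based Adaptive Feature Fusion multi-task network (VAFF-Net) for joint segmentation, detection, and classification of RV, FAZ, and RVJ in optical coherence tomography angiography (OCTA). A task-specific voting gate module is proposed to adaptively extract and fuse different features for specific tasks at two levels: features at different spatial positions from a single encoder, and features from multiple encoders. In particular, because the complexity of the microvasculature in OCTA images makes it challenging to simultaneously localize retinal vascular junctions and classify them as bifurcations or crossings, we specifically design a task head that combines heatmap regression and grid classification. We use three different en face angiograms from various retinal layers, rather than following existing methods that use only a single en face angiogram. To facilitate further research, part of these datasets has been released for public access: https://github.com/imed-lab/vaff-net.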
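Multi-view clustering (MVC) optimally integrates complementary information from different views to improve clustering performance. Although they have demonstrated promising performance in various applications, most existing approaches directly fuse multiple pre-specified similarities to learn an optimal similarity matrix for clustering, which can cause over-complicated optimization and intensive computational cost. In this paper, we propose late fusion MVC via alignment maximization to address these issues. To do so, we first reveal the theoretical connection between existing k-means clustering and the alignment between base partitions and the consensus one. Based on this observation, we propose a simple but effective multi-view algorithm termed LF-MVC-GAM. It optimally fuses multiple sources of information at the partition level from each individual view, and maximally aligns the consensus partition with these weighted base partitions. Such an alignment helps integrate partition-level information and significantly reduces the computational complexity by sufficiently simplifying the optimization procedure. We then design another variant, LF-MVC-LAM, to further improve clustering performance by preserving the local intrinsic structure among multiple partition spaces. After that, we develop two three-step iterative algorithms to solve the resultant optimization problems with theoretically guaranteed convergence. Further, we provide a generalization error bound analysis of the proposed algorithms. Extensive experiments on eighteen multi-view benchmark datasets, ranging from small to large scale, demonstrate the effectiveness and efficiency of the proposed LF-MVC-GAM and LF-MVC-LAM. The code of the proposed algorithms is publicly available at https://github.com/wangsiwei2010/latefusionalignment.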
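One way to write down the alignment between the consensus partition and the weighted base partitions is given below. It is a sketch rather than the paper's stated formulation, and all symbols are assumptions introduced here: $H_p \in \mathbb{R}^{n \times k}$ is the base partition of view $p$, $W_p$ is a rotation that makes base partitions comparable, $\beta_p$ is the weight of view $p$, and $H$ is the consensus partition over $m$ views.

```latex
\max_{H,\ \{W_p\}_{p=1}^{m},\ \beta}\
\operatorname{Tr}\!\Big( H^{\top} \sum_{p=1}^{m} \beta_p H_p W_p \Big)
\quad \text{s.t.}\quad
H^{\top} H = I_k,\quad
W_p^{\top} W_p = I_k,\quad
\|\beta\|_2 = 1,\quad
\beta_p \ge 0 .
```

Under these assumptions the variables are $n \times k$ partition matrices and $k \times k$ rotations rather than $n \times n$ similarity matrices, which is consistent with the reduced computational complexity claimed in the abstract.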
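Clustering is a representative unsupervised method that is widely applied in multi-modal and multi-view scenarios. Multiple kernel clustering (MKC) aims to group data by integrating complementary information from base kernels. As a representative approach, late fusion MKC first decomposes the kernels into orthogonal partition matrices and then learns a consensus partition from them, and it has recently achieved promising performance. However, these methods fail to consider the noise inside the partition matrices, which prevents further improvement of clustering performance. We find that this noise can be decomposed into two separable parts, i.e., N-noise and C-noise (null space noise and column space noise). In this paper, we rigorously define dual noise and propose a novel parameter-free MKC algorithm that minimizes it. To solve the resultant optimization problem, we design an efficient two-step iterative strategy. To the best of our knowledge, this is the first study of dual noise within partitions in the kernel space. We observe that dual noise pollutes the diagonal structure and degrades clustering performance, and that C-noise is more destructive than N-noise. Owing to our efficient mechanism for minimizing dual noise, the proposed algorithm surpasses recent methods.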
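Multiple kernel clustering (MKC) is committed to achieving optimal information fusion from a set of base kernels. Constructing precise and local kernel matrices has proved to be of vital significance in applications, since unreliable estimates of similarity between distant samples degrade clustering performance. Although existing localized MKC algorithms show improved performance compared with globally designed competitors, most of them localize the kernel matrix by considering only the τ-nearest neighbors. However, this coarse approach follows the unreasonable strategy of treating the ranking importance of different neighbors as equal, which is impractical in applications. To alleviate such problems, this paper proposes a novel local sample-weighted multiple kernel clustering (LSWMKC) model. We first construct a consensus discriminative affinity graph in kernel space, revealing the latent local structures. Further, an optimal neighborhood kernel is learned for the affinity graph, with a naturally sparse property and a clear block-diagonal structure. Moreover, LSWMKC optimizes adaptive weights for the different neighbors of each corresponding sample. Experimental results demonstrate that LSWMKC yields a better local manifold representation and outperforms existing kernel-based and graph-based clustering algorithms. The source code of LSWMKC can be publicly accessed from https://github.com/liliangnudt/lswmkc.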
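For context, the equal-weight localization that the abstract criticizes can be sketched as follows. This is a minimal illustration of the generic τ-nearest-neighbor baseline, not LSWMKC itself; the function name `localize_kernel` and the toy kernel are hypothetical.

```python
def localize_kernel(kernel, tau):
    """Keep the tau largest similarities in each row and zero out the rest.

    Every retained neighbor keeps its raw similarity and is treated as
    equally important, regardless of its rank among the neighbors.
    """
    localized = []
    for row in kernel:
        # Indices of the tau most similar samples (the sample itself included).
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        keep = set(ranked[:tau])
        localized.append([v if j in keep else 0.0 for j, v in enumerate(row)])
    return localized


# Toy 3 x 3 kernel (assumed values, for illustration only).
K = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.5],
     [0.1, 0.5, 1.0]]
K_local = localize_kernel(K, tau=2)
```

Note that the result is generally not symmetric (sample 2 keeps sample 1 as a neighbor, but not the reverse), and that no neighbor is weighted by its rank. The adaptive per-neighbor weights described in the abstract target exactly this limitation.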
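In recent years, graph neural networks (GNNs) have achieved promising performance in semi-supervised node classification. However, insufficient supervision, together with representation collapse, largely limits the performance of GNNs in this field. To alleviate the collapse of node representations in the semi-supervised scenario, we propose a novel graph contrastive learning method termed Mixed Graph Contrastive Network (MGCN). In our method, we improve the discriminative capability of the latent features by enlarging the margin of the decision boundaries and improving the cross-view consistency of the latent representations. Specifically, we first adopt an interpolation-based strategy to conduct data augmentation in the latent space and then force the prediction model to change linearly between samples. Second, we enable the learned network to tell apart samples across the two interpolation-perturbed views by forcing the cross-view correlation matrix to approximate an identity matrix. By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning. Extensive experimental results on six datasets demonstrate the effectiveness and generality of MGCN compared with existing state-of-the-art methods.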
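Semi-supervised learning (SSL) has long been proved to be an effective technique for building models with limited labels. In the existing literature, consistency regularization-based methods, which force perturbed samples to have predictions similar to those of the original samples, have attracted much attention. However, we observe that the performance of such methods degrades drastically when labels become extremely limited, e.g., 2 or 3 labels per category. Our empirical study finds that the main problem lies in the drift of semantic information during data augmentation. The problem can be alleviated when enough supervision is provided. However, when little guidance is available, the incorrect regularization misleads the network and undermines the performance of the algorithm. To tackle the problem, we (1) propose an interpolation-based method to construct more reliable positive sample pairs; and (2) design a novel contrastive loss that guides the embedding of the learned network to change linearly between samples, improving the discriminative capability of the network by enlarging the margin of the decision boundaries. Since no destructive regularization is introduced, the performance of our proposed algorithm is largely improved. Specifically, the proposed algorithm outperforms the second-best algorithm (COMATT) by 5.3%, achieving 88.73% classification accuracy when only two labels are available for each class on the CIFAR-10 dataset. Moreover, we further demonstrate the generality of the proposed method by considerably improving the performance of existing state-of-the-art algorithms with our proposed strategy.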
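The requirement that embeddings change linearly between samples can be illustrated with a small sketch. It makes several assumptions: the abstract describes a contrastive loss, whereas this sketch substitutes a plain squared-distance penalty, and the names `interpolate`, `linearity_loss`, and the two toy encoders are hypothetical.

```python
def interpolate(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v of two vectors."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def linearity_loss(encoder, x_i, x_j, lam):
    """Squared distance between the embedding of the interpolated input
    and the interpolation of the two individual embeddings.

    The pair (embedding of mix, mix of embeddings) plays the role of a
    positive pair built by interpolation instead of data augmentation.
    """
    mixed_embedding = encoder(interpolate(x_i, x_j, lam))
    target = interpolate(encoder(x_i), encoder(x_j), lam)
    return sum((a - b) ** 2 for a, b in zip(mixed_embedding, target))


def linear_encoder(x):
    """Toy encoder that is linear in its input (assumed for illustration)."""
    return [2.0 * x[0] + x[1], x[0] - x[1]]


def nonlinear_encoder(x):
    """Toy encoder that is not linear in its input."""
    return [x[0] ** 2, x[1] ** 2]


x_i, x_j = [1.0, 0.0], [0.0, 1.0]
loss_linear = linearity_loss(linear_encoder, x_i, x_j, 0.5)
loss_nonlinear = linearity_loss(nonlinear_encoder, x_i, x_j, 0.5)
```

An encoder that already behaves linearly between the two samples incurs no penalty, while the nonlinear one does, so minimizing this quantity encourages the linear behavior between samples that the abstract describes.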
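Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years. However, we observe that, in the process of node encoding, existing methods suffer from representation collapse, which tends to map all data into the same representation. Consequently, the discriminative capability of the node representations is limited, leading to unsatisfactory clustering performance. To address this issue, we propose a novel self-supervised deep graph clustering method termed Dual Correlation Reduction Network (DCRN), which reduces information correlation in a dual manner. Specifically, we first design a Siamese network to encode samples. Then, by forcing the cross-view sample correlation matrix and the cross-view feature correlation matrix to approximate two identity matrices, respectively, we reduce the information correlation at both levels, thus improving the discriminative capability of the resulting features. Moreover, to alleviate the representation collapse caused by over-smoothing in GCN, we introduce a propagation regularization term that enables the network to gain long-distance information with a shallow network structure. Extensive experimental results on six benchmark datasets demonstrate the effectiveness of the proposed DCRN against existing state-of-the-art methods.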
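The dual correlation reduction idea can be sketched as below. The abstract does not specify the normalization or the exact penalty, so this sketch assumes cosine-normalized correlation matrices and a mean squared distance to the identity. The function names are hypothetical.

```python
import math


def l2_normalize(rows):
    """Scale every row to unit Euclidean length (zero rows are left as is)."""
    out = []
    for r in rows:
        norm = math.sqrt(sum(x * x for x in r)) or 1.0
        out.append([x / norm for x in r])
    return out


def cross_correlation(a, b):
    """Cosine-similarity matrix between the rows of a and the rows of b."""
    a, b = l2_normalize(a), l2_normalize(b)
    return [[sum(x * y for x, y in zip(ra, rb)) for rb in b] for ra in a]


def identity_gap(m):
    """Mean squared difference between a square matrix and the identity."""
    n = len(m)
    return sum((m[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(n) for j in range(n)) / (n * n)


def dual_correlation_loss(z1, z2):
    """z1, z2: N x d embeddings of the same N nodes from the two views."""
    sample_corr = cross_correlation(z1, z2)      # N x N, node versus node
    t1 = [list(c) for c in zip(*z1)]             # d x N, one row per feature
    t2 = [list(c) for c in zip(*z2)]
    feature_corr = cross_correlation(t1, t2)     # d x d, feature versus feature
    return identity_gap(sample_corr) + identity_gap(feature_corr)


# Toy check: distinct embeddings versus fully collapsed embeddings.
distinct = [[1.0, 0.0], [0.0, 1.0]]
collapsed = [[1.0, 1.0], [1.0, 1.0]]
loss_distinct = dual_correlation_loss(distinct, distinct)
loss_collapsed = dual_correlation_loss(collapsed, collapsed)
```

Collapsed embeddings make every off-diagonal correlation equal to one, so both terms are large, whereas distinct, decorrelated embeddings drive both terms to zero.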